Access Ordering Algorithms for an Interleaved Memory
Author
Abstract
Superscalar processors are well suited for meeting the demands of scientific computing, given sufficient memory bandwidth. Employing parallel memory modules increases the bandwidth available; however, storage schemes devised to reduce module conflict for vector computers are not suitable for scalar computation. Access ordering is a compilation technique that increases effective bandwidth by reordering references to exploit the underlying memory system. Ordering algorithms are derived in this report for a sequentially interleaved memory architecture.

Memory system parameters:
  $w$: word size
  $p$: page size
  $T_{p/r}$: page-hit read cycle time
  $T_{p/w}$: page-hit write cycle time
  $T_{p/m}$: page-miss overhead
  $T_{u/r}$: uniform-access read cycle time
  $T_{u/w}$: uniform-access write cycle time

Stream parameters:
  $v$: stream start address (vector accessed)
  $s$: stride of access
  $d$: data size
  $m$: mode of access
  $\sigma$: number of data items referenced per functional iteration

MAP notation:
  $a_i$: access to the next element of stream $t_i$
  $a_i^k$: $k$th access from $t_i$ for a given access sequence iteration
  $S$: set of all streams in a given MAP
  $N_S$: number of streams in $S$
  $V_S$: number of different vectors referenced by streams in $S$
  $b$: depth of loop unrolling

Performance measures:
  $T_{avg}$: average time per access
  $BW$: processor-memory bandwidth

General properties of stream $t_i$:
  number of accesses per loop iteration
  intermix factor

Properties of stream $t_i$ for a sequentially interleaved architecture:
  number of modules referenced
  set of modules referenced
  module stride
  maximum number of accesses serviced at any module for a given iteration

Modeling functions:
  average number of data items per word
  average number of data items per page
  average per iteration page miss count
  average per iteration page miss count for intermixed write stream
  average per iteration page miss count for wrap-around adjacent read stream
  effect of intermixing on average page miss count of write stream
  effect of wrap-around adjacency on page miss count of read stream
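The effect that access ordering targets can be illustrated with a toy cost model. The sketch below is not the report's algorithm: it assumes a single page-mode memory bank (no interleaving), and the function names, addresses, and timing values are all hypothetical. It only shows why grouping accesses to the same vector, made possible by unrolling a loop to some depth, lowers the average time per access when a page miss costs more than a page hit.

```python
def average_access_time(addresses, words_per_page, t_hit, t_miss):
    """Average cost per access for one page-mode bank (toy model).

    Every access pays t_hit; an access to a page other than the
    currently open one additionally pays the page-miss overhead t_miss.
    """
    open_page = None
    total = 0
    for addr in addresses:
        page = addr // words_per_page
        if page != open_page:
            total += t_miss
            open_page = page
        total += t_hit
    return total / len(addresses)


def natural_order(x_base, y_base, n):
    """Access sequence of a loop like y[i] = x[i]: x and y alternate."""
    order = []
    for i in range(n):
        order.append(x_base + i)
        order.append(y_base + i)
    return order


def grouped_order(x_base, y_base, n, unroll):
    """Same accesses after unrolling: `unroll` x accesses, then `unroll` y accesses."""
    order = []
    for start in range(0, n, unroll):
        block = range(start, min(start + unroll, n))
        order.extend(x_base + i for i in block)
        order.extend(y_base + i for i in block)
    return order


# Hypothetical parameters: 512-word pages, hit = 1 cycle, miss overhead = 4 cycles.
WORDS_PER_PAGE, T_HIT, T_MISS = 512, 1, 4
natural = natural_order(0, 4096, 16)
grouped = grouped_order(0, 4096, 16, unroll=4)

print(average_access_time(natural, WORDS_PER_PAGE, T_HIT, T_MISS))  # 5.0
print(average_access_time(grouped, WORDS_PER_PAGE, T_HIT, T_MISS))  # 2.0
```

In this model the alternating order misses the open page on every access, while the grouped order misses only when it switches vectors, so a larger unrolling depth amortizes the page-miss overhead over more accesses. A model of the interleaved architecture treated in the report would also need to track which module each address maps to.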
Similar resources
Morton ordering of 2D arrays for efficient access to hierarchical memory
This paper investigates the recursive Morton ordering of two-dimensional arrays as an efficient way to access hierarchical memory across a range of heterogeneous computer platforms, from many-core devices and multicore processors to clusters and distributed environments. A brief overview of previous research in this area is given, and algorithms that make use of Morton ordering are described...
Partition the Banks, not the Functionality, of Large-Window Load/Store Queues
Designing scalable memory ordering hardware is one of the most important challenges for large-window, out-of-order processor design, due to its complexity, power, and its criticality for high performance. Recent research has aimed to partition the functionality of load/store queues (LSQ) into three components: ordering violations detection, value forwarding, and store buffering for commit, to a...
Low Power March Memory Test Algorithm for Static Random Access Memories (TECHNICAL NOTE)
Memories are among the most important building blocks in many digital systems. As the requirements on integrated circuits grow, the test circuitry must grow as well. There is a need for more efficient test techniques with low power and high speed. Many memory built-in self-test techniques have been proposed to test memories. Compared with combinational and sequential circuits, memory testing utilizes ...
Access Ordering and Effective Memory Bandwidth
High-performance scalar processors are characterized by multiple pipelined functional units that can be initiated simultaneously to exploit instruction level parallelism. For scientific codes, the performance of these processors depends heavily on memory bandwidth. To achieve peak processor rate, data must be supplied to the arithmetic units at the peak aggregate rate of consumption. Access ord...
Streaming Algorithm for Determining a Topological Ordering of a Digraph
Finding a topological ordering for a directed graph is one of the fundamental problems in computer science. Several textbook-standard algorithms using linear memory have been discovered and utilized to solve several other problems for many decades, especially in resolving dependencies and solving other graph connectivity problems. However, these algorithms are becoming less practical nowadays a...
Journal title:
Volume, issue
Pages -
Publication date: 1992